Assessing spatiotemporal variation in abundance: A flexible framework accounting for sampling bias with an application to common pochard (Aythya ferina)

Abstract Assessing trends in the relative abundance of populations is a key yet complex issue for management and conservation. This is a major aim of many large‐scale censusing schemes such as the International Waterbird Count (IWC). However, owing to the lack of sampling strategy and standardization, such schemes likely suffer from biases due to spatial heterogeneity in sampling effort. Despite huge improvements of the statistical tools that allow tackling these statistical issues (e.g., GLMM, Bayesian inference), many conservationists still prefer to rely on stand‐alone turn‐key statistical tools, often violating the prerequisites put forward by the developers of these tools. Here, we propose a straightforward and flexible approach to tackle the typical statistical issues one can encounter when analyzing count data of monitoring schemes such as the IWC. We rely on IWC counts of the declining common pochard populations of the Northwest European flyway as a case study (period 2002–2012). To standardize the size of sampling units and mitigate spatial autocorrelation, we grouped sampling sites using a 75 × 75 km grid cells overlaid over the flyway of interest. Then, we used a hierarchical modeling approach, assessing population trends with random effects at two spatial scales (grid cells, and sites within grid cells) in order to derive spatialized values and to compute the average population trend at the whole flyway scale. Our approach allowed to tackle many statistical issues inherent to this type of analysis but often neglected, including spatial autocorrelation. Concerning the case study, our main findings are that: (1) the northwestern population of common pochards experienced a steep decline (4.9% per year over the 2002–2012 period); (2) the decline was more pronounced at high than low latitude (11.6% and 0.5% per year at 60° and 46° of latitude, respectively); and, (3) the decline was independent of the initial number of individuals in a given site (random across sites). Beyond the case study of the common pochard, our study provides a conceptual statistical framework for estimating and assessing potential drivers of population trends at various spatial scales.


| INTRODUC TI ON
Wildlife populations are often distributed over huge areas, especially in migratory species (e.g., common pochards occupy continental flyways, Folliot et al., 2018), making difficult the implementation of reliable strategies for assessing their abundance and temporal trends (Jarman et al., 1996). Counting migratory birds, whose distribution often encompasses whole continents, requires the coordination of censuses and surveys over numerous entities (states, provinces, and countries) not necessarily sharing the same management priorities or conservation status for the same species. Nevertheless, the monitoring of migratory birds has a long tradition of coordinated "transnational" population censuses (Amano et al., 2018;Gregory et al., 2007;Lehikoinen et al., 2013;Sayoud et al., 2016). For example, Wetlands International coordinates the International Waterbird Census (IWC) over more than 100 countries around the world since 1967(Amano et al., 2018. In the Palearctic region, the management of Anatidae species is conducted at the flyway level based on the IWC (censuses performed around January 15, Elmberg et al., 2006). These counts, together with censuses performed during the breeding season, serve to establish the International Union for Conservation of Nature (IUCN)'s red lists (IUCN, 2020), and implement management actions. In Western Europe, the red list status of harvested species helps implementing hunting policies by member countries of such agreements as the African-Eurasian Waterbird Agreement (AEWA).
Large-scale censuses and surveys also serve scientific purposes. The internationally coordinated censuses, such as IWC, the Mid-Winter Survey, or the Christmas Bird Count (MWS or CBC respectively, its North American counterparts) and the Pan-European Common Bird Monitoring Scheme (PECBMS), have recently been used to assess the efficiency of conservation policies (Amano et al., 2018;Gaget et al., 2020;Jørgensen et al., 2016), birds' response to climate change (Lehikoinen et al., 2013;Meehan et al., 2021;Pavón-Jordán et al., 2019), or both (Gaget et al., 2018).
Unfortunately, international bird censuses, as those carried out in the Palearctic area, potentially suffer weaknesses that should be addressed adequately in the analyses. The methods required to model the population size from the censuses data depend on the type of collected data. In most large-scale monitoring, models used to infer population relative abundance rely on the assumption that the population abundance estimated in a site can be used to infer the abundance in the surrounding area. For example, most models of the CBC data suppose that the counted numbers of birds in all sites in a given region ("stratum") are similar enough to allow the inference of an average relative abundance for the region (Soykan et al., 2016). Models of North American Breeding Bird Survey (BBS) rely on a similar assumption, with all the routes belonging to a given region used to estimate the average relative abundance of the birds in this region (Link & Sauer, 1998;Sauer & Link, 2011). This similarity of bird abundances on routes close to each other is often more explicitly accounted for using spatial models (Bled et al., 2013;Thogmartin et al., 2004).
However, considering waterbirds, we cannot expect that bird abundances are similar in close locations: the number of ducks is likely to be very different on a small pond and on the neighboring lake. Therefore, a model of censuses cannot suppose that the abundance of water birds on a given water point is similar to the abundance on water points of the surrounding region, precluding the use of classical regression models such as those developed for BBS or CBC. On the other hand, even if adjacent entities are not characterized by similar bird abundances, they may potentially be sharing the same demographic characteristics, in particular population trends (Knape & de Valpine, 2012). This suggests that a modeling approach allowing to infer average population trends in regions without attempting to estimate the population abundance might be useful for such species.
When the IWC scheme was implemented, each country was asked to maximize the inferential potential of their monitoring efforts by focusing on more abundant sites to allow for more accurate estimation of populations trends, but no other sampling strategy was proposed to help achieving these goals (Delany, 2010). Therefore, censuses schemes were initially designed to be easily implemented at large scale to involve as many countries/territories as possible and almost entirely lacked a real sampling strategy. As a result, the managers in charge of the counts tended to prefer biodiversity hotspots for monitoring. Sampled sites may differ in size, densities of individuals, observation pressure, skill of observers, census methods, and so on, yet these differences are most of the time completely ignored in the analyses. As a consequence, overdispersion often "plague" the datasets and may seriously affect both estimate values and variable selection. Indeed, unaccounted overdispersion leads to an overestimated precision of estimates, and thereby to an inflated type I error rate (e.g., a random between-year variation in the estimated population size considered as a significant change in actual population size, or an environmental variable wrongly considered as having a significant effect on the population size; Cameron & Trivedi, 1998, chap. 3). For this reason, most authors include overdispersion residuals in their models of census data (Link & Sauer, 1998;Sauer & Link, 2011) or use statistical distributions other than Poisson (e.g., quasi-Poisson approaches, Pannekoek & van Strien, 2001). Finally, some data may be missing (Komdeur et al., 1992) leading to some difficulties in the inference of the population size and trends.

K E Y W O R D S
common pochard, ducks, hierarchical modeling, population trends, sampling bias, spatial autocorrelation

Spatial ecology
In summary, the lack of sampling strategy and standardization over time and space, together with the other problems mentioned above, may lead to important biases that should be considered in the analyses of wildlife census data. To date, the assessment of population trends often relies on a "standalone turn-key solution" (usually TRIM software, Lehikoinen et al., 2013;Musil et al., 2011;Musilová et al., 2018), which has proven invaluable but suffers from drawbacks (Amano et al., 2018or Meehan et al., 2021. TRIM tackles overdispersion but does not allow accounting for the problem of spatial autocorrelation in the population trends. However, most TRIM users, so far, simply ignored spatial autocorrelation together with the possibility to tackle overdispersion, meaning that in numerous instances the models did not fit the data. Trying to circumvent this last problem, some authors opted for a "last resort" solution Here, we present a flexible statistical approach aimed at providing unbiased estimates of temporal trends in numbers derived from censuses collected in different countries and hence, potentially suffering the same biases as discussed above. We use the IWC data of common pochard (Aythya ferina -hereafter pochard) in the Northwestern European flyway as a case study. This diving duck is considered to have experienced a sharp decline over most of its range including the focal flyway (mean annual declines of −5.97% and −2.16% in the Northwestern and Central European Flyways, respectively, as estimated with TRIM, 2003, Nagy et al., 2014. Such unfavorable status has caused the species to be upgraded from Least Concern (LC) to Vulnerable (VU) on the European and global IUCN Red Lists in 2015(BirdLife International, 2015, and demographic studies to be initiated Folliot et al., 2017Folliot et al., , 2018Folliot et al., , 2020Gourlay-Larour et al., 2012, 2013Keller, 2009).
The aims of the study were as follows: (1) to exemplify how biased a typical IWC dataset can be, (2) to develop a hierarchical modeling framework that adequately considers overdispersion and spatial autocorrelation in relative abundance trends, and (3) to assess within this framework the trends of pochard populations in Northwestern Europe, taking into account possible differences between the different parts of the wintering range in this area. We selected the IWC's pochard dataset as a case study both because it included the typical biases of interest mentioned above, and because these biases potentially plague many waterbird count datasets including those of IWC.

| MATERIAL S AND ME THODS
The pochard is a hunted diving duck whose distribution encompasses the Palearctic region (i.e., Western Eurasia and North Africa), mostly above 40° of latitude (Folliot et al., 2018;Fox et al., 2016).
We focused on pochard populations of the Northwestern European flyway. For the sake of simplicity (to avoid tackling, e.g., threshold effects often characterizing long-term count time series) and comparison with previous estimates, we focused on the January IWC counts of the 2002-2012 decade. During this period, the dataset covered 3273 sites counted at least once. Since it is difficult to assess the long-term trend of population size on a site using only a few counts, we focused our modeling approach on those sites counted at least 60% of the time (7 times) during the study period, representing a total of 981 sites ( Figure 1).

| Assessing the nature of the problem
Before designing the statistical approach for the assessment of the population trend, we first fitted a preliminary simple Poisson generalized linear mixed model of the number of pochards on a site during a given year as a linear function of the year. This model included a site-specific intercept and a random effect of the site on the slope of the year. Indeed, while the slopes of N sites can be considered as N noisy measurements of an average trend, the intercepts (average number of individuals) may display consistent differences between sites due to, for example, differences in site size area. This model also included overdispersion residuals, and was fitted using the package TMB for the R software (a package that accommodate both large volume of data and large number of parameters, Kristensen et al., 2016).
This analysis revealed that this period was characterized by a steep linear decline in the number of individuals counted (mean slope equal to −0.04, SE = 0.005; although this SE probably overestimates the precision of the slope, as it does not account for the spatial autocorrelation in the trends, see below). We assessed the spatial autocorrelation of the estimates of regression slopes estimated for each site with a Moran test (Cliff & Ord, 1981) based on a relative neighborhood graph (Toussaint, 1980, spdep package -Bivand et al., 2015. Moran's test detected highly significant but weakly positive spatial autocorrelation (Moran statistic I = 0.08, p = .006), meaning that neighboring sites tended to display similar temporal trends. A map showing the distribution of the sampled sites suggests that this spatial autocorrelation was due to the spatial heterogeneity of sampling effort, with some areas being oversampled ( Figure 1). In some areas, the distance between neighboring sampling sites was in some cases lower than 100 m, leading to a high probability that the same groups of birds were counted on neighboring sites. Not accounting for this unequal sampling intensity would lead to a disproportionate weight of oversampled areas in the analysis, leading to overall trend estimates strongly affected by the local trends observed in these areas.

| Using a grid to ensure the standardization of sampling units
We designed a hierarchical modeling approach to account for spatial autocorrelation and non-random site selection (sampling heterogeneity). To ensure the reproducibility of the analyses carried out in this study, we included all the code and data used for this modeling approach in the R package pochardTrend (Digital Object Identifier: https://doi.org/10.5281/zenodo.5710550) on GitHub at the following URL: https://github.com/Cleme ntCal enge/pocha rdTrend. The reader can install this package in R with the package devtools (Wickham et al., 2021), using the function devtools::install_github ("ClementCalenge/pochardTrend", ref="main"). The pochardTrend package includes a vignette describing how the user can easily reproduce the model fit (available with the command vignette ("pochardTrend"), once the package has been installed). We wrote this vignette to provide the user with the essential information on the model. This vignette also describes the model checks and residual analysis carried out for the obtained model (see below).
To account for spatial sampling heterogeneity, we "standardized" the size of sampling areas using a grid made of 193 (75 × 75 km) grid cells encompassing the flyway of interest ( Figure 1). Each grid cell included at least 1 of the 981 sites retained in the analyses (see above).
We opted for such a spatial resolution because it offered the best compromise between a low resolution and an overly high one that would have included non-sampled territories (grid cells). Therefore, each "sampling unit" covered a standard area and all the sampling units had the same weight in the analysis. Shifting from a set of sampled points with an irregular geographical repartition to a lattice process is a common strategy to standardize the area sampling unit and give them the same weight in the analysis (Bled et al., 2013).

| Structure of the hierarchical model
A major aim of our study was to develop a hierarchical model for estimating population trends from IWC-like data. We therefore fitted a model of the form: Where N it is the number of individuals detected in site i during year t. We described these numbers using a Poisson distribution with mean it , supposing the following model for this parameter: where is the slope of the latitude L q(i) of the grid cell q(i) containing the site i and b is the standard deviation of this Gaussian distribution.
Here, we modeled the trend as a function of the latitude only, but it would in theory be possible to model the slope as a function of other environmental variables (e.g., land cover change). We focused on the latitude to assess possible differences in trends over the species' range on a North-South axis due to migratory short stopping (shortening of migration distance leading to increasing numbers at high latitudes and decreasing numbers at low latitudes) in response (1) F I G U R E 1 Spatial distribution of the 981 sites surveyed at least 7 years between 2002 and 2012 (red circles) located in the northwestern flyway with the juxtaposition of 193 grid cells (75 × 75 km) when at least one site surveyed is present for the monitoring of the common pochard (Aythya ferina) to increasing winter temperatures (Elmberg et al., 2014;Tománková et al., 2013).
To summarize, this hierarchical approach estimated (1) the population trend 0 + b q(i) in each grid cell and (2)  To give more meaningful measure of the trends, we estimated the median percentage r of decrease in the population at a given latitude L over a duration D by calculating: We calculated this percentage at latitudes 60° and 46°, for durations of 1 year and 10 years.
In order to avoid convergence issues caused by correlations between the intercept and the slope, we centered year (t) at the year 2007 (middle of the study period, following the recommendation of Pinheiro and Bates (2000) to center explanatory variables before the fit). The model ran on R software (R Core Team, 2013) using the TMB package (Kristensen et al., 2016). Finally, we tested the existence of a remaining spatial autocorrelation in the random effects d i and b q(i) with a Moran test.

| RE SULTS
The strategy of standardization of sampling units through the use of grid cells proved effective to tackle the problem of spatial autocorrelation at both the grid cell and site levels (grid cell level: Moran test I = 0.060, p-value = .14; site level: Moran test I = 0.012, p-value = .33). Moreover, the random estimates of slopes for both the site and grid cell did not show any departure from normality, suggesting that the model correctly fitted the data. Finally, the plot of the residuals of the model against year did not exhibit any particular pattern, meaning that the observed trend did not depart from log-linearity.
We identified a significant negative effect of the latitude of a grid cell on the population trend ( Table 1). The decline in the number of pochards estimated using our hierarchical model (Equation 5) was greater at northern than at southern latitudes. Based on our model, we estimated a significant decrease in population size equal to 74% over the entire period for latitudes of 60° (SE = 6%; corresponding to a mean decrease of 11.6% per year, SE = 1.8% -calculation carried out within the developed R package vignette). In contrast, the decline was not significant in grid cells at 46° of latitude (mean decrease of 5.7%, SE =14%; corresponding to a mean decrease of 0.54% per year, SE = 1.4%, Figure 2).
Overall, the decline in northwestern pochard populations averaged 4.9% per year over the 11 years of the study period (SE = 0.6%), which represents a total decline of 42.3% (SE = 4%; Figure 3). Standard deviations of the random effects of slopes (Table 1)

| DISCUSS ION
In this study, we provide an example of the typical issues one may encounter when analyzing such large-scale population censuses (here IWC) and provide a general framework to tackle them. In a preliminary step, we ran analyses that allowed not only detecting spatial autocorrelation but also helped characterizing its nature, that is, spatio-temporal heterogeneity in sampling effort. In addition, these preliminary analyses showed that the decline in pochards observed during the study period did not depart from linearity. In a second step, we standardized the sampling units using a 75 × 75 km grid overlaid on the flyway of interest. This strategy proved efficient both to remove spatial autocorrelation (and therefore allowed estimating improved population trends regarding this potential source of bias) and to detect spatial (latitudinal) heterogeneity in the observed decline in the focal species in the area of interest.
Despite potential biases, one must recognize that without large- of protected areas and implementation of conservation action plans, Pavón-Jordán et al., 2020;Marchowski et al., 2020). Unfortunately, because these surveys were initiated without proper sampling strategies and in a time (1935s) when data storage, digitizing, and statistical computing were still in their infancy, they may suffer from serious weaknesses that must be addressed properly, for example, using appropriate modeling framework, before any inference.
Bayesian hierarchical modeling has previously been used to assess bird population trends (including ducks) in North America based on MWS, CBC counts, and North American Breeding Bird Surveys (Link et al., 2006;Meehan et al., 2021;Sauer & Link, 2011;Soykan et al., 2016). However, the majority of these studies did not explicitly tackle spatial autocorrelation. Note that several authors also used a grid discretization of the study area to both standardize sampling units and model the spatial and temporal autocorrelation structure in the data (Bled et al., 2013;Meehan et al., 2019) into a Bayesian hierarchical model (conditional autoregressive and CAR structure).
We did not model the autocorrelation using such models in our study because the standardization of sampling units through the grid approach proved fully efficient to remove spatial autocorrelation/structure. This does not mean, however, that a 75 × 75 km grid will prove efficient to remove spatial autocorrelation in all instances.
Should they adopt our approach, future studies may have to adjust the size of grid cells and eventually resort to approach consisting to tackle spatial structure directly into the model (Meehan et al., 2019).
Thus, turning back to the focal species, it turns out that the annual decline in populations in the northwestern flyway averaged 4.9% ± 0.6 SE per year, a value similar to those computed previously with TRIM by Nagy et al. (2014) and BirdLife International (2015) (respectively, 5.9% per year on average over 2003-2012 and 4.4% per year over 2006-2015). Therefore, our more robust analysis confirms that pochard population experienced a sharp decline in the northwestern flyway.
Although our model proved to be efficient in circumventing many limits of the dataset, this modeling approach is not the panacea and cannot be considered as a replacement for a proper sampling of the population. Indeed, if we had the means to design a proper sampling design over the continental scale, we would also probably have defined another aim to the study. More precisely, we would have designed the sampling design and protocol to allow the estimation of the trend of the whole population size over the migration corridor.
Due to the lack of common sampling strategy, we could not reach such an aim, and we designed a model allowing to estimate a spatial mean of the local trends over the corridor, which we used as a proxy of the pochard population trend. However, a spatial mean of the local trends is not exactly the same thing as the trend of the whole population size. Indeed, most of the pochard population is probably located in a restricted number of grid cells (e.g., in the Netherlands, where wetlands are numerous), whereas this spatial mean gives the same weight to all the grid cells of the corridor including those with a tiny fraction of the population. The local trends estimated in scarcely populated grid cells thus weigh too much in our global assessment.
Our estimated population trend is therefore probably biased, although the population decline observed in most of the corridor gives us confidence in our qualitative assessment of the population status.
Numerous waterbird populations have been found (or are expected) to display range shifts toward the north/northeast owing to migration short stopping (reduction in migration distances) in response to rising temperatures (Elmberg et al., 2014;Visser et al., 2009). In these species, the numbers of wintering individuals usually experience declines in southwestern parts of the wintering range Previous studies addressing the problem of short stopping in waterbirds (e.g., ducks and wading birds) either computed geographic range shifts  or based their reasoning upon crude comparisons between country in geographical changes in populations trends (Lehikoinen et al., 2013;Marchowski et al., 2020). Here, we provide a formal, reproducible framework for estimating how population trends vary over space showing that the observed decline is gradually increasing with latitude, which is similar in spirit to previous work on similar issues. We just changed the target of the inference, shifting from the estimation of a trend based on an estimate of total F I G U R E 3 Expected trend in the number of common pochards counted in the northwestern flyway computed for average sites at 46°, 50°, 55°, and 60° latitude (curve and 95% confidence intervals in blue). The green area corresponds to the 95% prediction interval on the trends at the scale of the grid cell. The red area corresponds to the 95% prediction interval on the trends at the scale of the site. The blue area corresponds to the 95% credible interval on the mean trend. Note that all confidence and prediction intervals have minimal width in 2007 because the variable "year" has been centered to allow the fit of the model (i.e., t = 0 for year 2007, see Material and Methods) relative abundance on an area (Bled et al., 2013;Sauer & Link, 2011) to the estimation of an average of trends of local relative abundance because of the impossibility for us to suppose a spatial autocorrelation in relative abundance in sampled sites for waterbirds (i.e., no reason to suppose that the abundance will be similar in two close water points with different sizes). From a conservation and management point of view, therefore, our approach provides a convenient and flexible way for accurately assessing the status of species across space and time and eventually seeking its ultimate causes.
One of the most important finding of our study, which has important biological and statistical consequences, concerns the fact that population trends varied randomly across sites, and is not correlated with the intercept of our model, that is, with the average number of animals detected in the site. The rate of decline, therefore, did not vary according to mean numbers of individuals detected in the site.
This is an interesting finding because if we suppose that the number of detected animals reflects the actual number of animals present in a site, then mean population trends can be assessed -in averageeven in sites where few animals were detected, although this would still require a large number of such sites due to the large variability in the estimated trends from one site to another. This also gives us insights into the understanding of pochard's social processes during winter, and more particularly that large aggregations are not systematically favored by the birds as is often suspected. However, we concur that our approach cannot finely assess behavioral and/or social processes and that further research is needed to assess the underlying (social or behavioral) mechanisms of group structure and size.

| CON CLUS IONS
The approach presented here may provide a convenient framework for quantitatively assessing spatial and temporal changes in population trends. However, although a significant improvement, our approach will probably never achieve the same performances (in terms of accuracy and precision) as a standardized sampling scheme. The take-home message therefore is that census schemes such as the IWC should as far as possible aim at implementing a standardized sampling strategy allowing tackling the problems of sample representativeness and spatial heterogeneity of sampling effort before data collection. Only such a strategy would provide reliable estimates of both population sizes and trends. This issue was raised a few years ago with the Midwinter Inventory in the USA (Eggeman & Johnson, 1989;Heusmann, 1999).
These surveys were abandoned in most states in favor of inventory with a standardized sampling protocol (see, e.g., Herbert et al., 2018;Masto et al., 2020;Whitaker et al., 2019).

ACK N OWLED G M ENTS
This study was supported by the Office National de la Chasse et de la Faune Sauvage (now Office Français de la Biodiversité) through a PhD grant to BF. We warmly thank Szabolcs Nagy and Tom Langendoen from Wetlands International Head Office for providing the pochard wintering census dataset. We also thank the many people who spend much time to count birds, allowing us to work over great spatial and temporal scales. Finally, we thank two anonymous reviewers and the associate editor for their valuable comments and suggestions that greatly improved the content of the manuscript.

CO N FLI C T O F I NTE R E S T
The authors have no competing interests.

DATA AVA I L A B I L I T Y S TAT E M E N T
We included all the code and data used in the R package pochardTrend (Digital Object Identifier: https://zenodo.org/recor d/57105 50#. YkGTk OdBxhF) on GitHub at the following URL: https://github.com/ Cleme ntCal enge/pocha rdTrend. The reader can install this package in R using the function devtools::install_github ("ClementCalenge/ pochardTrend", ref="main"). The pochardTrend package includes a vignette describing how the user can easily reproduce the model fit (available with the command vignette ("pochardTrend"), once the package has been installed).